FEAT: expose explicit data shape metadata in load_data_source #417

Open
biru-codeastromer wants to merge 1 commit into sktime:main from biru-codeastromer:feat/expose-data-shape-and-scitype-metadata
@biru-codeastromer
Contributor

Summary

Closes #416.

This adds explicit, agent-friendly shape metadata to the load_data_source and load_data_source_async responses, so MCP clients can reason about loaded data handles without guessing from raw column names and dtypes.

What changed

  • add a shared metadata builder in Executor
  • expose:
    • target_scitype
    • target_variates
    • has_exog
    • exog_variates
    • index_type
    • n_target_columns
    • n_exog_columns
  • apply the same metadata enrichment to both sync and async load paths
  • add focused tests for:
    • datetime-indexed target with multivariate exogenous data
    • range-indexed target without exogenous data
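To illustrate the fields listed above, here is a minimal sketch of what such a metadata builder could look like. It is not the code from this PR: the function name build_shape_metadata, the helper names, and the concrete string values ("univariate", "datetime", "Series", and so on) are assumptions chosen for illustration, and the real builder lives in Executor.

```python
from typing import Any, Optional, Union

import pandas as pd

Frame = Union[pd.Series, pd.DataFrame]


def _n_columns(data: Optional[Frame]) -> int:
    """Count columns, treating a Series as one column and None as zero."""
    if data is None:
        return 0
    return 1 if isinstance(data, pd.Series) else data.shape[1]


def _variates(n_columns: int) -> Optional[str]:
    """Map a column count to a label (labels are assumed, not from the PR)."""
    if n_columns == 0:
        return None
    return "univariate" if n_columns == 1 else "multivariate"


def _index_type(index: pd.Index) -> str:
    """Classify the time index; for a MultiIndex, assume the last level is time."""
    time_index = index.levels[-1] if isinstance(index, pd.MultiIndex) else index
    if isinstance(time_index, pd.DatetimeIndex):
        return "datetime"
    if isinstance(time_index, pd.PeriodIndex):
        return "period"
    if isinstance(time_index, pd.RangeIndex):
        return "range"
    if pd.api.types.is_integer_dtype(time_index):
        return "integer"
    return "other"


def _target_scitype(index: pd.Index) -> str:
    """Assumed mapping from index depth to an sktime-style scitype name."""
    if index.nlevels == 1:
        return "Series"
    return "Panel" if index.nlevels == 2 else "Hierarchical"


def build_shape_metadata(y: Frame, X: Optional[Frame] = None) -> dict[str, Any]:
    """Hypothetical builder returning the fields this PR exposes."""
    n_target = _n_columns(y)
    n_exog = _n_columns(X)
    return {
        "target_scitype": _target_scitype(y.index),
        "target_variates": _variates(n_target),
        "has_exog": n_exog > 0,
        "exog_variates": _variates(n_exog),
        "index_type": _index_type(y.index),
        "n_target_columns": n_target,
        "n_exog_columns": n_exog,
    }
```

Keeping the builder a pure function of the loaded target and exogenous data is one way to let the sync and async load paths share it without duplicating logic.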

Why this matters

This makes loaded data self-describing for agents before they choose estimators or workflow branches. In particular, it reduces ambiguity around:

  • whether a handle is univariate or multivariate
  • whether exogenous variables are present
  • whether the time index is datetime-like or integer/range-based
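As a rough illustration of the three points above, an agent could branch on the metadata along these lines. The function plan_from_metadata and the decision keys are hypothetical, and the compared values ("multivariate", "datetime") are assumed rather than taken from the PR.

```python
from typing import Any


def plan_from_metadata(meta: dict[str, Any]) -> dict[str, bool]:
    """Hypothetical client-side helper: turn shape metadata into workflow flags."""
    return {
        # A multivariate target rules out estimators that accept one column only.
        "needs_multivariate_estimator": meta.get("target_variates") == "multivariate",
        # Only pass exogenous data to fit/predict when the handle carries it.
        "pass_exog_to_fit": bool(meta.get("has_exog", False)),
        # Calendar-based features only make sense on a datetime-like index.
        "can_use_calendar_features": meta.get("index_type") == "datetime",
    }


example = plan_from_metadata(
    {"target_variates": "univariate", "has_exog": True, "index_type": "range"}
)
```

With the example input, the agent would pass exogenous data through but skip multivariate estimators and calendar features, without inspecting column names or dtypes.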

Validation

  • .venv/bin/ruff check src/sktime_mcp/runtime/executor.py tests/test_data_sources.py
  • .venv/bin/ruff format --check src/sktime_mcp/runtime/executor.py tests/test_data_sources.py
  • .venv/bin/pytest tests/test_data_sources.py -q
  • .venv/bin/pytest -q

Development

Successfully merging this pull request may close these issues.

[ENH] expose explicit data shape metadata for agent reasoning